Semi-automatic grammar recovery
Authors
Abstract
We propose an approach to the construction of grammars for existing languages. Its main characteristic is that the grammars are not constructed from scratch but are instead recovered by extracting them from language references, compilers, and other artifacts. We provide a structured process for recovering grammars, including the adaptation of raw extracted grammars and the derivation of parsers. The process is applicable to possibly all existing languages for which business-critical applications exist. We illustrate the approach with a non-trivial case study. Using our process and some basic tools, we constructed a complete and correct VS COBOL II grammar specification for IBM mainframes in a few weeks. In addition, we constructed a parser for VS COBOL II and were the first to publish a (web-enabled) grammar specification, so that others can use this result to construct their own grammar-based tools for VS COBOL II or its derivatives.
Similar resources
The Amsterdam Toolkit for Language Archaeology
GRK — the Grammar Recovery Kit — illustrates options for automation and corresponding tool support in the context of developing quality language references that readily cater for the derivation of parsers. GRK provides the proof-of-concept for two notions: (i) semi-automatic grammar recovery; (ii) language-reference re-engineering. GRK’s support for semi-automatic grammar recovery means that GR...
MARS: A Metamodel Recovery System Using Grammar Inference
Domain-specific modeling (DSM) assists subject matter experts in describing the essential characteristics of a problem in their domain. When a metamodel is lost, repositories of domain models can become orphaned from their defining metamodel. Within the purview of model-driven engineering, the ability to recover the design knowledge in a repository of legacy models is needed. In thi...
Semi-automatic acquisition of domain-specific semantic structures
This paper describes a methodology for semi-automatic grammar induction from unannotated corpora belonging to a restricted domain. The grammar contains both semantic and syntactic structures, which are conducive towards language understanding. Our work aims to ameliorate the reliance of grammar development on expert handcrafting or the availability of annotated corpora. To strive for a reasonab...
Learning Strategies In A Grammar Induction Framework
This work extends a semi-automatic grammar induction approach previously proposed in [1]. We investigate the use of Information Gain (IG) in place of Mutual Information (MI) for grammar induction based on an unannotated training corpus. Experiments using the ATIS-3 training corpus indicate that the use of IG led to better precision and recall of desired semantic categories and at earlier stages...
MARS: A metamodel recovery system using grammar inference
Domain-specific modeling (DSM) assists subject matter experts in describing the essential characteristics of a problem in their domain. Various software artifacts can be generated from the defined models, including source code, design documentation, and simulation scripts. The typical approach to DSM involves the construction of a metamodel, from which instances are defined that represent speci...
Journal: Softw., Pract. Exper.
Volume: 31
Publication date: 2001